Discovering Latent Structures: Experience with the CoIL Challenge 2000 Data Set
Abstract
We present a case study demonstrating that complex and interesting latent structures can be discovered using hierarchical latent class (HLC) models. A similar effort was made earlier [6], but that study involved only small applications with 4 or 5 observed variables. Thanks to recent progress in algorithm research, it is now possible to learn HLC models with dozens of observed variables. We have successfully analyzed a version of the CoIL Challenge 2000 data set that consists of 42 observed variables. The resulting model contains 22 latent variables, and its structure is intuitively appealing.
Similar resources
Discovery of latent structures: Experience with the CoIL Challenge 2000 data set
The authors present a case study to demonstrate the possibility of discovering complex and interesting latent structures using hierarchical latent class (HLC) models. A similar effort was made earlier by Zhang (2002), but that study involved only small applications with 4 or 5 observed variables and no more than 2 latent variables due to the lack of efficient learning algorithms. Significant pr...
A Data Mining Case Study
The CoIL Challenge 2000 offered the opportunity to apply data mining methods to a real-world data set. This paper describes an approach to solve the tasks of the Challenge. It also points the reader to the theoretical background of the subgroup analysis algorithm Midos. Its implementation for the data mining system Kepler was most intensively used for the investigations outlined in the text. Th...
An Integrated DEA and Data Mining Approach for Performance Assessment
This paper presents a data envelopment analysis (DEA) model combined with bootstrapping to assess the performance of a data mining algorithm. We applied a two-step process to analyze the productivity of insurance branches in a case study. First, using a DEA model, the study analyzes the productivity of eighteen decision-making units (DMUs). Using a Malmquist index, DEA deter...
Image alignment via kernelized feature learning
Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption of most machine learning algorithms is that the training set (source domain) and the test set (target domain) follow the same probability distribution. However, in most of the real-world application...
Probabilistic Community Discovery Using Hierarchical Latent Gaussian Mixture Model
Complex networks exist in a wide array of diverse domains, ranging from biology and sociology to computer science. These real-world networks, while disparate in nature, often consist of a set of loose clusters (a.k.a. communities), whose members are better connected to each other than to the rest of the network. Discovering such inherent community structures can lead to deeper understanding about...